Identification of singleton mentions in Russian

Authors

  • Max Ionov
  • Svetlana Toldova
Abstract

This paper describes a pilot study on detecting singleton mentions in Russian texts. A noun phrase is considered a singleton mention if it is the only mention of some entity. We discuss various morphosyntactic and lexical features, some of which have been used for analogous tasks in English, and propose new features derived from discourse analysis. After testing machine learning classifiers trained on the proposed features, we conclude that although the quality of the classifiers is significantly lower than for English, they still have rather high precision and can thus be helpful in various mention-tracking tasks.


Similar resources

Singleton Detection using Word Embeddings and Neural Networks

Singleton (or non-coreferential) mentions are a problem for coreference resolution systems, and identifying singletons before mentions are linked improves resolution performance. Here, a singleton detection system based on word embeddings and neural networks is presented, which achieves state-of-the-art performance (79.6% accuracy) on the CoNLL2012 shared task development set. Extrinsic evaluat...


The Life and Death of Discourse Entities: Identifying Singleton Mentions

A discourse typically involves numerous entities, but few are mentioned more than once. Distinguishing discourse entities that die out after just one mention (singletons) from those that lead longer lives (coreferent) would benefit NLP applications such as coreference resolution, protagonist identification, topic modeling, and discourse coherence. We build a logistic regression model for predic...


Constitutive Features of the Russian Political Discourse in Ecolinguistic Aspect

The article offers a comparative description of typological mechanisms used in political communicative practice and methods of verbal explication of its axiological and symbolic constituents determining universal mental features of individual/collective consciousness. The research position based on a systemic multilevel analysis of the component structure of discourse facilitates the identifica...


Kappa-casein Genotypic Frequencies in Russian Breeds Black and Red Pied Cattle

Casein is a family of milk proteins that exists in several molecular forms and is the main protein present in bovine milk. The B variant of bovine k-casein is reported to be favorable for the quality and quantity of cheese derived from milk and is considered for inclusion in breeding strategies of dairy cattle. Genotypes of 72 Russian Black Pied and 80 Red Pied cows were determined for ...


MayAnd at SemEval-2016 Task 5: Syntactic and word2vec-based approach to aspect-based polarity detection in Russian

This paper describes an aspect-based polarity detection system for Russian, used in the aspect-based sentiment analysis (ABSA) task of SemEval-2016 (Task 5, subtask 1, slot 3). The system consists of two independent classifiers: one for opinion target expressions and one for implicit opinion target mentions. We introduce a set of standard unigram features together with more sophisticated ones: based on sentenc...




Publication date: 2017